Restricted Value Iteration: Theory and Algorithms

Authors

  • Nevin Lianwen Zhang
  • Weihong Zhang
Abstract

Value iteration is a popular algorithm for finding near-optimal policies for partially observable Markov decision processes (POMDPs). It is inefficient due to the need to account for the entire belief space, which necessitates the solution of large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-optimal policies and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.

Similar resources

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

Providing an algorithm for solving general optimization problems based on Domino theory

Optimization is a very important process in engineering. Engineers can achieve better production only if they use optimization tools to reduce its costs, including time consumption. Many real-world engineering problems cannot be solved mathematically (by mathematical programming solvers). Therefore, meta-heuristic optimization algorithms are needed to solve these pro...

Application of variational iteration method for solving singular two point boundary value problem

DEA methodology allows DMUs to select the weights freely, so in the optimal solution we may see many zeros in the optimal weight. To overcome this problem, there are some methods, but they are not suitable for evaluating DMUs with fuzzy data. In this paper, we propose a new method for solving fuzzy DEA models with restricted multipliers with less computation, and compare this method with Liu'...

Dhage iteration method for PBVPs of nonlinear first order hybrid integro-differential equations

In this paper, the author proves algorithms for the existence as well as the approximation of solutions to a couple of periodic boundary value problems of nonlinear first order ordinary integro-differential equations using operator theoretic techniques in a partially ordered metric space. The main results rely on the Dhage iteration method embodied in the recent hybrid fixed point theorems of D...

Acceleration Operators in the Value Iteration Algorithms for Average Reward Markov Decision Processes

One of the most widely used methods for solving average cost MDP problems is the value iteration method. This method, however, is often computationally impractical and restricted in size of solvable MDP problems. We propose acceleration operators that improve the performance of the value iteration for average reward MDP models. These operators are based on two important properties of Markovian ...

Journal:
  • J. Artif. Intell. Res.

Volume 23

Publication date: 2005